150 research outputs found

    In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

    In this work we present In-Place Activated Batch Normalization (InPlace-ABN) - a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing straightforward applicability for existing deep learning frameworks. We obtain memory savings of up to 50% by dropping intermediate results and by recovering the required information during the backward pass through the inversion of stored forward results, with only a minor increase (0.8-2%) in computation time. We also demonstrate how frequently used checkpointing approaches can be made computationally as efficient as InPlace-ABN. In our experiments on image classification, we demonstrate on-par results on ImageNet-1k with state-of-the-art approaches. On the memory-demanding task of semantic segmentation, we report results for COCO-Stuff, Cityscapes and Mapillary Vistas, obtaining new state-of-the-art results on the latter without additional training data and in a single-scale, single-model scenario. Code can be found at https://github.com/mapillary/inplace_abn
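
    As a rough illustration of the inversion idea (a minimal sketch, not the authors' CUDA implementation; the function names and the leaky-ReLU choice are assumptions), the stored output of the fused layer can be inverted during the backward pass to recover the batch-normalized values, so the layer input need not be kept:

```python
# Minimal sketch of the inversion idea (not the authors' implementation):
# only the output z of BatchNorm + leaky ReLU is kept; the quantities needed
# in backward are recovered by inverting z instead of storing the BN input.
import torch

def fused_bn_act(x, gamma, beta, eps=1e-5, slope=0.01):
    """Per-feature BatchNorm followed by leaky ReLU; returns only z and stats."""
    mean = x.mean(dim=0)
    var = x.var(dim=0, unbiased=False)
    y = gamma * (x - mean) / torch.sqrt(var + eps) + beta   # BN output (discarded)
    z = torch.where(y >= 0, y, slope * y)                   # activation output (stored)
    return z, mean, var

def recover_from_output(z, gamma, beta, slope=0.01):
    """Invert the stored output to get y and x_hat for the backward pass."""
    y = torch.where(z >= 0, z, z / slope)   # leaky ReLU is invertible (slope > 0)
    x_hat = (y - beta) / gamma              # BN affine is invertible (gamma != 0)
    return y, x_hat
```

    The inversion only requires a strictly monotonic activation and a non-zero gamma, which is why the fused layer can discard its input tensor.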

    Best Sources Forward: Domain Generalization through Source-Specific Nets

    A long-standing problem in visual object categorization is the ability of algorithms to generalize across different testing conditions. The problem has been formalized as a covariate shift among the probability distributions generating the training data (source) and the test data (target), and several domain adaptation methods have been proposed to address this issue. While these approaches have considered the single source-single target scenario, it is plausible to have multiple sources and require adaptation to any possible target domain. This last scenario, named Domain Generalization (DG), is the focus of our work. Differently from previous DG methods, which learn domain-invariant representations from source data, we design a deep network with multiple domain-specific classifiers, each associated with a source domain. At test time we estimate the probabilities that a target sample belongs to each source domain and exploit them to optimally fuse the classifiers' predictions. To further improve the generalization ability of our model, we also introduce a domain-agnostic component supporting the final classifier. Experiments on two public benchmarks demonstrate the power of our approach.
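
    A minimal sketch of the fusion step described above (illustrative PyTorch, not the paper's code; class and layer names are assumptions):

```python
# Illustrative sketch: source-specific classifiers fused at test time with the
# estimated probability that the sample comes from each source domain.
import torch
import torch.nn as nn

class SourceSpecificEnsemble(nn.Module):
    def __init__(self, feat_dim, num_classes, num_sources):
        super().__init__()
        self.classifiers = nn.ModuleList(
            nn.Linear(feat_dim, num_classes) for _ in range(num_sources))
        self.domain_head = nn.Linear(feat_dim, num_sources)  # which source domain?

    def forward(self, feats):
        w = torch.softmax(self.domain_head(feats), dim=1)              # (B, S)
        preds = torch.stack([torch.softmax(c(feats), dim=1)
                             for c in self.classifiers], dim=1)        # (B, S, C)
        return (w.unsqueeze(-1) * preds).sum(dim=1)                    # (B, C) fused
```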

    Robust Place Categorization With Deep Domain Generalization

    Traditional place categorization approaches in robot vision assume that training and test images have similar visual appearance. Therefore, any seasonal, illumination, or environmental change typically leads to severe degradation in performance. To cope with this problem, recent works have proposed to adopt domain adaptation techniques. While effective, these methods assume that some prior information about the scenario where the robot will operate is available at training time. Unfortunately, in many cases this assumption does not hold, as we often do not know where a robot will be deployed. To overcome this issue, in this paper we present an approach that aims at learning classification models able to generalize to unseen scenarios. Specifically, we propose a novel deep learning framework for domain generalization. Our method develops from the intuition that, given a set of different classification models associated with known domains (e.g., corresponding to multiple environments or robots), the best model for a new sample in the novel domain can be computed directly at test time by optimally combining the known models. To implement our idea, we exploit recent advances in deep domain adaptation and design a convolutional neural network architecture with novel layers performing a weighted version of batch normalization. Our experiments, conducted on three common datasets for robot place categorization, confirm the validity of our contribution.
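
    A hedged sketch of what a weighted batch normalization step could look like (an assumed form for illustration, not the authors' layer):

```python
# Normalize each sample with a convex combination of the statistics of the
# known domains, weighted by the estimated probability that the sample
# belongs to each domain.
import torch

def weighted_batch_norm(x, domain_means, domain_vars, weights, eps=1e-5):
    """
    x:            (B, C) features
    domain_means: (D, C) running means, one row per known domain
    domain_vars:  (D, C) running variances, one row per known domain
    weights:      (B, D) per-sample domain assignment probabilities (rows sum to 1)
    """
    mean = weights @ domain_means      # (B, C) blended mean
    var = weights @ domain_vars        # (B, C) blended variance (a simplification)
    return (x - mean) / torch.sqrt(var + eps)
```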

    Learning Deep NBNN Representations for Robust Place Categorization

    This paper presents an approach for semantic place categorization using data obtained from RGB cameras. Previous studies on visual place recognition and classification have shown that, by considering features derived from pre-trained Convolutional Neural Networks (CNNs) in combination with part-based classification models, high recognition accuracy can be achieved, even in the presence of occlusions and severe viewpoint changes. Inspired by these works, we propose to exploit local deep representations, representing images as sets of regions and applying a Naïve Bayes Nearest Neighbor (NBNN) model for image classification. As opposed to previous methods, where CNNs are merely used as feature extractors, our approach seamlessly integrates the NBNN model into a fully-convolutional neural network. Experimental results show that the proposed algorithm outperforms previous methods based on pre-trained CNN models and that, when employed in challenging robot place recognition tasks, it is robust to occlusions as well as environmental and sensor changes.
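
    For reference, the classic NBNN decision rule the abstract builds on can be sketched as follows (illustrative NumPy; the paper integrates this idea into a fully-convolutional network rather than running an explicit nearest-neighbour search):

```python
# An image is classified by the smallest image-to-class distance, i.e. the sum
# over its local descriptors of the distance to the closest descriptor of that class.
import numpy as np

def nbnn_classify(image_desc, class_desc):
    """
    image_desc: (N, d) local descriptors of the test image
    class_desc: dict mapping class id -> (M_c, d) descriptors pooled from training images
    """
    scores = {}
    for c, D in class_desc.items():
        d2 = ((image_desc[:, None, :] - D[None, :, :]) ** 2).sum(-1)  # (N, M_c)
        scores[c] = d2.min(axis=1).sum()       # image-to-class distance
    return min(scores, key=scores.get)         # predicted class
```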

    AdaGraph: Unifying Predictive and Continuous Domain Adaptation through Graphs

    The ability to categorize is a cornerstone of visual intelligence, and a key functionality for artificial, autonomous visual machines. This problem will never be solved without algorithms able to adapt and generalize across visual domains. Within the context of domain adaptation and generalization, this paper focuses on the predictive domain adaptation scenario, namely the case where no target data are available and the system has to learn to generalize from annotated source images plus unlabeled samples with associated metadata from auxiliary domains. Our contribution is the first deep architecture that tackles predictive domain adaptation, able to leverage the information brought by the auxiliary domains through a graph. Moreover, we present a simple yet effective strategy that allows us to take advantage of the incoming target data at test time, in a continuous domain adaptation scenario. Experiments on three benchmark databases support the value of our approach. Comment: CVPR 2019 (oral).
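
    A hedged sketch of the metadata-graph idea (illustrative NumPy, not the AdaGraph code; the kernel choice and parameter layout are assumptions):

```python
# Parameters for an unseen target domain are predicted as a weighted combination
# of the auxiliary domains' parameters, with weights given by metadata similarity
# (the edges of the domain graph).
import numpy as np

def predict_target_params(target_meta, aux_meta, aux_params, bandwidth=1.0):
    """
    target_meta: (m,)   metadata vector of the target domain
    aux_meta:    (K, m) metadata of the K auxiliary domains (graph nodes)
    aux_params:  (K, p) domain-specific parameters, e.g. BN affine parameters
    """
    d2 = ((aux_meta - target_meta) ** 2).sum(axis=1)     # squared metadata distances
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))             # edge weights to the target node
    w /= w.sum()
    return w @ aux_params                                # predicted target parameters
```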

    AutoDIAL: Automatic DomaIn Alignment Layers

    Classifiers trained on given databases perform poorly when tested on data acquired in different settings. This is explained in domain adaptation through a shift among the distributions of the source and target domains. Attempts to align them have traditionally resulted in works reducing the domain shift by introducing, in the objective function, appropriate loss terms measuring the discrepancies between source and target distributions. Here we take a different route, proposing to align the learned representations by embedding, in any given network, specific Domain Alignment Layers designed to match the source and target feature distributions to a reference one. In contrast to previous works, which define a priori in which layers adaptation should be performed, our method is able to automatically learn the degree of feature alignment required at different levels of the deep network. Thorough experiments on different public benchmarks, in the unsupervised setting, confirm the power of our approach. Comment: arXiv admin note: substantial text overlap with arXiv:1702.06332; added supplementary material.
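
    A simplified sketch of such a domain alignment layer (the exact cross-domain statistics in the paper differ; mixing variances directly is a simplification made here for brevity):

```python
# A learnable alpha in [0.5, 1] controls how much each domain is normalized with
# the other domain's batch statistics, i.e. the degree of alignment learned per layer.
import torch
import torch.nn as nn

class DomainAlignLayer(nn.Module):
    def __init__(self, eps=1e-5):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(1.0))   # learned degree of alignment
        self.eps = eps

    def forward(self, x_src, x_tgt):
        a = self.alpha.clamp(0.5, 1.0)
        ms, vs = x_src.mean(0), x_src.var(0, unbiased=False)
        mt, vt = x_tgt.mean(0), x_tgt.var(0, unbiased=False)
        norm = lambda x, m, v: (x - m) / torch.sqrt(v + self.eps)
        # Each domain is normalized with a mixture of its own and the other's stats.
        src_out = norm(x_src, a * ms + (1 - a) * mt, a * vs + (1 - a) * vt)
        tgt_out = norm(x_tgt, a * mt + (1 - a) * ms, a * vt + (1 - a) * vs)
        return src_out, tgt_out
```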

    Boosting Deep Open World Recognition by Clustering

    While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed world scenarios, where the number of semantic concepts to be recognized is determined by the available training set. Since it is practically impossible to capture all possible semantic concepts present in the real world in a single training set, we need to break the closed world assumption, equipping our robot with the capability to act in an open world. To provide such ability, a robot vision system should be able to (i) identify whether an instance does not belong to the set of known categories (i.e. open set recognition), and (ii) extend its knowledge to learn new classes over time (i.e. incremental learning). In this work, we show how we can boost the performance of deep open world recognition algorithms by means of a new loss formulation enforcing a global-to-local clustering of class-specific features. In particular, the first loss term, i.e. global clustering, forces the network to map samples closer to the centroid of the class they belong to, while the second one, local clustering, shapes the representation space so that samples of the same class get closer while neighbours belonging to other classes are pushed away. Moreover, we propose a strategy to learn class-specific rejection thresholds, instead of heuristically estimating a single global threshold as in previous works. Experiments on the RGB-D Object and CORe50 datasets show the effectiveness of our approach. Comment: IROS/RAL 2020.
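
    An illustrative sketch of the two loss terms (an assumed form, not the paper's exact losses):

```python
# A global term pulls features toward their class centroid; a local, triplet-style
# term pulls same-class neighbours together while pushing away other-class neighbours.
import torch
import torch.nn.functional as F

def global_clustering_loss(feats, labels, centroids):
    """Mean squared distance of each feature to its own class centroid."""
    return ((feats - centroids[labels]) ** 2).sum(dim=1).mean()

def local_clustering_loss(feats, labels, margin=1.0):
    """Same-class samples should be closer than the nearest other-class sample."""
    d = torch.cdist(feats, feats)                          # (B, B) pairwise distances
    same = labels[:, None] == labels[None, :]
    farthest_pos = (d * same.float()).max(dim=1).values    # farthest same-class sample
    nearest_neg = d.masked_fill(same, float('inf')).min(dim=1).values
    return F.relu(farthest_pos - nearest_neg + margin).mean()
```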

    Contribution of surface electromyography in tennis: proposal of a new normalization method for the upper limb muscles: influence of velocity and fatigue on upper limb muscle activity in tennis

    The main purpose of this thesis is the study of upper limb muscle activity through surface electromyography (EMG) during a dynamic activity. An initial study showed that seven out of nine muscles can be normalized from two maximum dynamic tasks, while the two other muscles require the traditional isometric method. This procedure helps to improve the reliability of upper limb EMG while reducing the time needed for normalization. In addition, the study of the relationship between EMG and stroke velocity in the tennis forehand drive highlighted changes in EMG amplitude and in the activation timing of some muscles in response to the increase in ball velocity. Finally, a third study showed that the fatigue generated by intense tennis exercise results in a decrease in the activation level of the pectoralis major and the forearm muscles during strokes, without any change in activation timing. This decrease in EMG activity could explain the performance degradation observed during this experiment. However, strategies for protecting the body and/or managing the speed-accuracy trade-off should be considered and call for future studies.
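
    A minimal sketch of amplitude normalization (an assumption about the procedure, shown only to make the idea concrete):

```python
# Each muscle's EMG envelope is expressed as a percentage of the maximum amplitude
# recorded across its reference (dynamic or isometric) tasks.
import numpy as np

def normalize_emg(envelope, reference_envelopes):
    """
    envelope:             (T,) rectified, smoothed EMG of the analysed stroke
    reference_envelopes:  list of (T_i,) envelopes from the reference tasks
    """
    reference_max = max(ref.max() for ref in reference_envelopes)
    return 100.0 * envelope / reference_max     # % of the reference maximum
```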